A semantic framework for analyzing safe composition of distributed programs is presented.
Its applicability is illustrated by a study of program composition when communication is
reliable but not necessarily FIFO. In this model, special care must be taken to ensure that
messages do not accidentally overtake one another in the composed program. We show that
barriers do not exist in this model. Indeed, no program that sends or receives messages can
automatically be composed with arbitrary programs without jeopardizing their intended
behavior. Safety of composition becomes context-sensitive and new tools are needed for
ensuring it. A notion of sealing is defined, where if a program P is immediately followed by a
program Q that seals P then P will be communication-closed: it will execute as if it runs in
isolation. The investigation of sealing in this model reveals a novel connection between
Lamport causality and safe composition. A characterization of sealable programs is given, as
well as efficient algorithms for testing if Q seals P and for constructing a seal for a
significant class of programs. It is shown that every sealable program that is open to
interference on $O(n^2)$ channels can be sealed using $O(n)$ messages.

1. Introduction

Much of the distributed algorithms literature is devoted to solutions for individual tasks. Implicitly it may 
appear that these solutions can be readily combined to create larger applications. Composing such solutions 
is not, however, automatically guaranteed to maintain their correctness and their intended behavior. For 
example, algorithms are typically designed under the assumption that they begin executing in a well-defined 
initial global state in which all channels are empty. In most cases, the algorithms are not guaranteed to 
terminate in such a state. Another inherent feature of distributed systems is that, even though they are often 
designed in clearly separated phases, these phases typically execute concurrently. For instance, Lynch writes
in [Lyn96, p. 523]:

"An MST algorithm can be used to solve the leader-election problem [...]. Namely, after
establishing an MST, the processes participate in the STtoLeader protocol to select the leader. Note
that the processes do not need to know when the MST algorithm has completed its execution
throughout the network; it is enough for each process i to wait until it is finished locally, [...]."

In general, when two phases, such as implementations of an MST algorithm and of the STtoLeader
algorithm, are developed independently and then executed in sequence, one phase may confuse messages
originating from the other with its own messages. Perhaps the first formal treatment of this issue was via
the notion of communication-closed layers introduced by Elrad and Francez in [EF82]. Consider a program
$P = P_1 \parallel \dots \parallel P_n$ consisting of $n$ concurrent processes $P_i = Q_i ; L_i ; Q'_i$, the
execution of which is, intuitively, divided into three phases, $Q_i$, $L_i$, and $Q'_i$. Elrad and Francez
define $L = L_1 \parallel \dots \parallel L_n$ to be a communication-closed layer (CCL) in $P$ if under no
execution of $P$ does a command in some $L_i$ communicate with a command in any $Q_j$ or $Q'_j$ [EF82].
If a program $P$ can be decomposed into a sequence of CCLs then every
execution of $P$ can be viewed as a concatenation of executions of $P$'s layers in order. Hence, reasoning
about $P$ can be reduced to reasoning about its layers in isolation. This approach has been investigated
further and applied to a variety of problems by Janssen, Poel, and Zwiers [JPZ91, Jan94, Jan95, PZ92]. Stomp
and de Roever considered related notions in the context of synchronous communication [SdR94]. Gerth and
Shrira considered the issue of using distributed programs as off-the-shelf components to serve as layers in
larger distributed programs [GS86]. They observe that the above definition of CCL is made with respect
to the whole program $P$ as context, and hence is unsuitable for off-the-shelf components. They solve the
problem by defining $L$ to be a General Tail Communication Closed (GTCC) layer if, roughly speaking, for
all layers $T_1 \parallel \dots \parallel T_n$ we have that $L$ is a CCL in
$L_1 ; T_1 \parallel \dots \parallel L_n ; T_n$. Since this definition does not refer
to the surrounding program context of a layer, it asserts a certain quality of composability. Sequentially
composing GTCC layers guarantees that each one of them is a CCL.

We develop a framework for defining and reasoning about various notions central to the design of CCLs 
in different models of communication. The communication model used in most of the literature concerning 
CCLs is that of reliable FIFO channels. In practice, channels often fail to satisfy this assumption. Three 
main sources of imperfection are loss, reordering, and duplication of messages by a channel. This paper 
studies the impact of message reordering on the design of CCLs. Our communication model, which we 
call Rel, will therefore assume that channels neither lose nor duplicate messages but message delivery is 
not necessarily FIFO. As we shall see, in Rel, the CCL property depends in an essential way on Lamport
causality [Lam78]. Indeed, to ensure CCL, causality is all that is needed in Rel, whereas either duplication
or loss already mandates the need for headers in messages [FL90, EM05c].

Consider for instance the task of transmitting a message $m$ from process $i$ to process $j$ where it is
stored in variable $x$. The task is accomplished by $i$ performing $\mathrm{SND}^{i \to j}_m$ to send the
message and $j$ performing $\mathrm{RCV}^{i \to j}_x$ to receive it into variable $x$. This implementation,
denoted $\mathrm{MT}^{i \to j}_{m,x}$ (for Message-Transmit), works fine in isolation. Composing two copies
of $\mathrm{MT}^{i \to j}$, however, does not guarantee the same behavior as executing the first to
completion and then executing the second. Since communication is not FIFO, the second message sent by $i$
could be the first one received by $j$. On the other hand, if $\mathrm{MT}^{i \to j}$ is followed by
$\mathrm{MT}^{j \to i}$ no such interference occurs. Moreover, no later program can ever interfere with the
first $\mathrm{MT}^{i \to j}$ in this pair. Of course the second program, $\mathrm{MT}^{j \to i}$, is still
susceptible to interference, e.g., by another $\mathrm{MT}^{j \to i}$.¹ In fact, non-trivial programs are
never safe from interference in Rel. As we shall show, for any terminating program $P$ transmitting a
message from $i$ to $j$ there is a program $Q$ potentially interfering with communication in $P$. One
consequence is that no terminating program that sends messages can be a GTCC layer.

Figure 1: $\mathrm{MT}^{j \to i}$ seals $\mathrm{MT}^{i \to j}$.

1 We omitted the subscript in $\mathrm{MT}^{j \to i}$. Whenever a parameter is irrelevant to the point being made, we tend to omit it.

The above discussion suggests that it is necessary to inspect the next layer in order to determine whether
a given layer is a CCL. In fact, we shall define a notion of a program $Q$ sealing its predecessor $P$, which
will ensure that $P$ is a CCL in $P$ immediately followed by $Q$. For example, $\mathrm{MT}^{j \to i}$ seals
$\mathrm{MT}^{i \to j}$ and vice versa. Intuitively, $Q$ seals $P$ if $Q$ guarantees that no message sent
after $P$ can be received in $P$. Let us consider why $\mathrm{MT}^{j \to i}$ seals $\mathrm{MT}^{i \to j}$.
Suppose that a later message is sent on the channel from $i$ to $j$ as in Fig. 1. This
send is performed only after the message sent in the opposite direction has been received by $i$, which in
turn must have been sent after the first message has been received by $j$. Consequently, $j$'s receive event
must precede $i$'s sending of the later message. Therefore, the later message cannot compete with the earlier
one. A message transmitted in the opposite direction is often called an acknowledgment. More interesting
examples of sealing are presented in Figures 2(a) and 3. For a decomposition of a program $P$ into a sequence
of $\ell$ layers $L^{(1)}, \dots, L^{(\ell)}$, it follows that if $L^{(k+1)}$ seals $L^{(k)}$ for all
$1 \le k < \ell$ then each layer is a CCL in $P$.
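The acknowledgment argument above can be checked by brute force on small examples. The following Python sketch is an illustration only, not one of the algorithms of this paper: the names `mt`, `layer`, `matchings`, and `interferes` are hypothetical, programs are assumed to be straight-line and given as per-process lists of sends and receives, and a matching is accepted when the causality relation it induces is acyclic.

```python
from itertools import permutations, product

def mt(i, j):
    """Message transmission from i to j: i sends once, j receives once."""
    return {i: [("snd", i, j)], j: [("rcv", i, j)]}

def layer(*programs):
    """Layered composition: each process runs its share of each program in order.
    Every event is tagged with the index of the layer it came from."""
    procs = sorted({p for prog in programs for p in prog})
    return {p: [(kind, src, dst, k)
                for k, prog in enumerate(programs)
                for (kind, src, dst) in prog.get(p, [])]
            for p in procs}

def acyclic(nodes, edges):
    """Kahn's algorithm: True iff the directed graph has no cycle."""
    indegree = {v: 0 for v in nodes}
    for _, target in edges:
        indegree[target] += 1
    ready = [v for v in nodes if indegree[v] == 0]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for source, target in edges:
            if source == node:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
    return visited == len(nodes)

def matchings(program):
    """Yield every assignment of receives to distinct sends on the same channel
    whose induced causality relation is acyclic (no FIFO requirement)."""
    events = {(p, n): ev for p, evs in program.items() for n, ev in enumerate(evs)}
    local = [((p, n), (p, n + 1)) for p, evs in program.items() for n in range(len(evs) - 1)]
    per_channel = []
    for ch in sorted({(ev[1], ev[2]) for ev in events.values()}):
        sends = [e for e, ev in events.items() if ev[0] == "snd" and (ev[1], ev[2]) == ch]
        rcvs = [e for e, ev in events.items() if ev[0] == "rcv" and (ev[1], ev[2]) == ch]
        per_channel.append([list(zip(perm, rcvs)) for perm in permutations(sends, len(rcvs))])
    for choice in product(*per_channel):
        messages = [edge for pairs in choice for edge in pairs]
        if acyclic(list(events), local + messages):
            yield messages

def interferes(program):
    """True iff some admissible matching pairs a send and a receive of different layers."""
    def layer_of(event):
        return program[event[0]][event[1]][3]
    return any(layer_of(s) != layer_of(r) for m in matchings(program) for (s, r) in m)

# Two transmissions from 1 to 2 may be received in either order ...
print(interferes(layer(mt(1, 2), mt(1, 2))))
# ... but an acknowledgment in between rules the swapped order out.
print(interferes(layer(mt(1, 2), mt(2, 1), mt(1, 2))))
```

In the second example the swapped matching would force the first receive of process 2 to both precede and follow the second send of process 1, so it is rejected as cyclic.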

In [Lam78] Lamport defined causality among events of asynchronous message passing systems. Causality
implies temporal precedence. As discussed above, transmitting an acknowledgment guarantees that the
receive of the first message causally precedes any later sends on the same channel. Observe that the same
effect could be obtained by other means ensuring the intended precedence. For instance, a causal chain
consisting of a sequence of messages starting at $j$, going through a number of intermediate processes, and
ending at $i$ could be used just as well. While this transitive form of acknowledgment appears to be
inefficient, a given message can play a role in a number of transitive acknowledgments. Fig. 2(a) illustrates
a program consisting of the transmission of three messages over three different channels. It is sealed using
transitive acknowledgments by the program displayed in Fig. 2(b), which sends only two messages.

Figure 2: An example of sealing. (a) A program P. (b) A seal for P.

Indeed, we shall later show how $O(n)$ messages can usefully substitute for $\Omega(n^2)$ acknowledgments. Not
all programs can be sealed. We shall later prove that program X shown in Fig. 3(a) is unsealable. The
same program executed in the presence of a third process as in Fig. 3(b) is, however, sealable. Any seal
of this program will necessarily use transitive acknowledgments as discussed above. See Fig. 3(c) for an
illustration of one way this program can be sealed.

Contributions. The first main contribution of this paper is in the presentation of a framework studying safe 
composition of layers of distributed programs in different models of communication. Within the framework 
we define notions including CCL and barriers. Moreover, it is possible to define new notions such as 
sealing that play an important role in ensuring safe composition. In this paper the power of the framework 
is illustrated by a comprehensive study of safe composition in Rel. In a companion paper [EM05b] the
framework is used to define additional notions that are used to study safe composition in FIFO-models with
duplicating and/or lossy channels.

Figure 3: An example of a program for two processes that is unsealable unless a third process is added.
(a) Unsealable program X. (b) A sealable program. (c) A seal for the program in (b).

Our second main contribution is in identifying the notion of sealing and demonstrating its central role in 
the design of CCLs in Rel. We study the theory of sealing in Rel and present the following results. 

• Sealable straight-line programs are completely characterized. 

• A definition of the sealing signature of straight-line programs is given, which characterizes the sealing
behavior of a program concisely, for both purposes, sealing and being sealed. The size of the signature
is $O(n^2)$.

• An algorithm for deciding whether Q seals P based only on their signatures is presented.

• An algorithm for constructing seals for sealable straight-line programs is presented. It produces seals
that perform fewer than $3n$ message transmissions even though $\Omega(n^2)$ channels may need to be sealed.

The restriction to straight-line programs is motivated by the undecidability of the corresponding problems
for general programs. Specifically, the halting problem can be reduced to each of these problems for general
programs. As far as communication closure is concerned, straight-line programs already display most of the
interesting aspects relevant to the subject of sealing.

2. A Model of Distributed Programs with Layering

In this section we define a simple language for writing message-passing concurrent programs. Its
composition operator "*" is called layering. Layering subsumes the two more traditional operators ";" and "||"
(as discussed by Janssen in [Jan94]). The meaning of $P * Q$ is that each process $i$ first executes its share of
$P$ and then proceeds directly to execute its share of $Q$. In particular, layering does not impose any barrier
synchronization between $P$ and $Q$. In other words, in $P * Q$ process $i$ need not wait for any other processes
to finish their shares of $P$ before moving on to $Q$. Consequently, programs execute between cuts rather than
global states. We shall define a notion $r[c,d] \Vdash P$ of a program $P$ occurring over an interval $r[c,d]$
between the cuts $c$ and $d$ of a run $r$.

Our later analysis will be concerned with CCLs $P$. Thus we need to ensure that no message crosses any
initial or final cut of an interval over which $P$ occurs. A concise way of capturing this formally is via a new
language construct, the silent cut, ¦. Writing ¦ specifies that all communication channels are empty at this
cut. In other words, no statement to the left of the ¦ can communicate with a statement to the right. If $P$ is a
CCL in a given larger program $L$ then every execution of $P$ in $L$ is also an execution of ¦P¦. In other words,
$P$ can be substituted for ¦P¦ in $L$.² We adopt a standard notion of refinement to indicate substitutability
of programs. Program $P$ refines program $Q$ if every execution of $P$ over an interval $r[c,d]$ is also one of
$Q$, regardless of what happens before $c$ and after $d$. The notions of "*", "¦", and refinement provide a
unified language for defining notions of safe composition. The programming language and its semantics are
formally defined as follows.

2.1. Syntax 

Let $n \in \mathbb{N}$ and $\mathbb{P} = \{1, \dots, n\}$ be a set of processes. Throughout the paper $n$ will
be reserved for denoting the number of processes. Let $(Var_i)_{i \in \mathbb{P}}$ be mutually disjoint sets of
program variables (of process $i$) not containing the name $h_i$, which is reserved for $i$'s communication
history. Let $Expr_i$ be the set of arithmetic expressions over $Var_i$. Let $\mathcal{L}$ be propositional logic
over atoms formed from expressions with equality "=" and less-than "<". We define a syntactic category Prg of
programs:

$Prg \ni P ::= \varepsilon \mid x := e \mid \mathrm{SND}^{i \to j}_e \mid \mathrm{RCV}^{i \to j}_x \mid [\varphi] \mid ¦ \mid P * P \mid P + P \mid P^\omega$

where $x \in Var_i$, $e \in Expr_i$, $i, j \in \mathbb{P}$, and $\varphi \in \mathcal{L}$.

The intuitive meaning of these constructs is as follows. The symbol $\varepsilon$ denotes the empty program. It
takes no time to execute. Assignment statement $x := e$ evaluates expression $e$ and assigns its value to
variable $x$. The $\mathrm{SND}^{i \to j}_e$ statement sends a message containing the value of $e$ on the channel
from $i$ to $j$. Communication is asynchronous, and sending is non-blocking. The $\mathrm{RCV}^{i \to j}_x$
statement, however, blocks until a message arrives on the channel from $i$ to $j$. It takes a message off the
channel and assigns its content to $x$. The guard $[\varphi]$ expresses a constraint on the execution of the
program: in a run of the program, $\varphi$ must hold at this location. Guards take no time to execute. The
program ¦ is a guard-like constraint stating that all channels must be empty at this location. Formally, our
propositional language $\mathcal{L}$ is not expressive enough to define ¦ as a guard because formulas are not
capable of referring to channel contents. The operation "*" represents layered composition following Janssen et
al. [Jan95]. Layering statements of distinct processes is essentially the same as parallel composition whereas
layering of statements of the same process corresponds to sequential composition. We tend to omit "*" when no
confusion will arise. The symbol "+" denotes nondeterministic choice. By $P^\omega$ we denote zero or more
(possibly infinitely many) repetitions of program $P$.³
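To make the reading of layering concrete, the following Python sketch encodes the straight-line fragment of the grammar as a small abstract syntax tree and projects a program onto the share of a single process. It is a minimal illustration under stated assumptions: the class names and the helper `share` are hypothetical, expressions and variables are plain strings, and guards, the silent cut, choice, and repetition are deliberately left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Eps:
    """The empty program."""

@dataclass(frozen=True)
class Assign:
    proc: int   # process owning the variable
    var: str
    expr: str

@dataclass(frozen=True)
class Snd:
    src: int    # sender i
    dst: int    # receiver j
    expr: str

@dataclass(frozen=True)
class Rcv:
    src: int    # sender i
    dst: int    # receiver j, which executes the statement
    var: str

@dataclass(frozen=True)
class Layer:
    """Layered composition P * Q."""
    first: object
    second: object

def share(prog, proc):
    """Statements of a straight-line program executed by `proc`, in program order."""
    if isinstance(prog, Layer):
        return share(prog.first, proc) + share(prog.second, proc)
    if isinstance(prog, Eps):
        return []
    if isinstance(prog, Snd):
        owner = prog.src
    elif isinstance(prog, Rcv):
        owner = prog.dst
    else:
        owner = prog.proc
    return [prog] if owner == proc else []

# A transmission from 1 to 2 layered with a transmission from 2 to 1.
mt12 = Layer(Snd(1, 2, "m"), Rcv(1, 2, "x"))
mt21 = Layer(Snd(2, 1, "x"), Rcv(2, 1, "y"))
prog = Layer(mt12, mt21)
print(share(prog, 1))   # process 1 sends, then receives
print(share(prog, 2))   # process 2 receives, then sends
```

Statements of the same process are sequenced, while statements of different processes end up in different shares and are therefore unordered unless communication orders them.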

2.2. Semantics 

A send record (for $i$) is a triple $(i \to j, v)$, which records sending a message with contents $v$ from $i$
to the receiver $j$. Similarly, $(j \leftarrow i, v)$ is a receive record (for $j$). A local state (for process
$i$) is a mapping from $Var_i$ to values and from $h_i$ to a sequence of send and receive records for $i$. A
local run (for process $i$) is an infinite sequence of local states. We identify an event (of $i$) with the
transition from one local state in a local run of $i$ to the next. An event is either a send, a receive, or an
internal event. A (global) run is a tuple $r = ((r_i)_{i \in \mathbb{P}}, \delta_r)$ of local runs, one for each
process, plus an injective matching function $\delta_r$ associating a send event with each receive event in $r$.
The mapping $\delta_r$ is restricted such that:⁴

1. If $\delta_r(e) = e'$ and $e$ is a receive event of process $j$ resulting in the appending of
$(j \leftarrow i, v)$ to $j$'s message history then $e'$ is a send event of process $i$ appending the
corresponding send record $(i \to j, v)$ to $i$'s message history.

2. Lamport's causality relation $\to$ induced by $\delta_r$ on the events of $r$, as defined below, is an
irreflexive partial order, hence acyclic.

The first condition captures the property that messages are not corrupted in transit. The fact that the function
$\delta_r$ is total precludes the reception of spurious messages, whereas injectivity ensures that messages are
not duplicated in transit. Further restrictions on $\delta_r$ can be made to capture additional properties of the
communication medium such as reliability, FIFO, fairness, etc.

We say that $r \in$ Rel if no unmatched send event is succeeded by infinitely many matched send events on
the same channel.

2 In place of the silent cut ¦, the preliminary version of this paper [EM05a] used a phase quantifier. A
quantified program there roughly corresponds to our ¦P¦.

3 Using guards, choices, and repetition it is possible to define if $\varphi$ then $P$ else $Q$ fi as an
abbreviation for $[\varphi] P + [\neg\varphi] Q$ and while $\varphi$ do $P$ od for
$([\varphi] P)^\omega [\neg\varphi]$. The results in this paper also hold for a language based on if and while
instead of $[\cdot]$, $+$, and $^\omega$.

4 Our choice of execution model is closely related to the more standard one of infinite sequences of global
states, representing an interleaving of moves by processes. Our conditions on $\delta_r$ guarantee the existence
of such an interleaving. In general, each of our runs represents an equivalence class of interleavings.

In [Lam78] Lamport defined a "happened before" relation $\to$ on the set of events occurring in a run $r$ of
a distributed system. The relation $\to$ is defined as the smallest transitive relation subsuming (1) the total
orders on the events of process $i$ given by the $r_i$, and (2) the relation
$\{ (e_1, e_2) \mid \delta_r(e_2) = e_1 \}$ between send and receive events induced by the matching function
$\delta_r$.
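The definition can be turned directly into a small computation on finite run prefixes. The Python sketch below is an illustration under assumptions, not part of the formal development: events are represented as hypothetical `(process, index)` pairs, `num_events` gives the number of events per process, and `delta` maps each receive event to its matching send event. It computes the transitive closure of the two generating relations and checks the two conditions placed on the matching function (injectivity and irreflexivity of the induced relation).

```python
def happened_before(num_events, delta):
    """Return a dict mapping each event to the set of events it causally precedes.

    num_events: dict process -> number of events of that process
    delta:      dict receive event -> send event, events being (process, index)
    """
    events = [(p, k) for p, n in num_events.items() for k in range(n)]
    direct = {e: set() for e in events}
    # (1) the total order on the events of each process
    for p, n in num_events.items():
        for k in range(n - 1):
            direct[(p, k)].add((p, k + 1))
    # (2) each send precedes the receive matched to it
    for rcv, snd in delta.items():
        direct[snd].add(rcv)
    # smallest transitive relation containing both: reachability by graph search
    closure = {}
    for e in events:
        seen, stack = set(), list(direct[e])
        while stack:
            f = stack.pop()
            if f not in seen:
                seen.add(f)
                stack.extend(direct[f])
        closure[e] = seen
    return closure

def is_admissible(num_events, delta):
    """Check injectivity of delta and irreflexivity of the induced relation."""
    injective = len(set(delta.values())) == len(delta)
    hb = happened_before(num_events, delta)
    return injective and all(e not in hb[e] for e in hb)

# Process 1: send, receive, send.  Process 2: receive, send, receive.
counts = {1: 3, 2: 3}
in_order = {(2, 0): (1, 0), (1, 1): (2, 1), (2, 2): (1, 2)}
swapped = {(2, 0): (1, 2), (1, 1): (2, 1), (2, 2): (1, 0)}
print(is_admissible(counts, in_order))   # the intended matching
print(is_admissible(counts, swapped))    # the later send matched to the first receive
```

With the intended matching, the first receive of process 2 causally precedes the second send of process 1, which is exactly the precedence that an acknowledgment is meant to establish.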

2.2.1. Cuts and Channels 

Write $\mathbb{N}^+$ for $\mathbb{N} \cup \{\infty\}$. A cut is a pair $(r,c)$ consisting of a run $r$ and a
$\mathbb{P}$-indexed family $c = (c_i)_{i \in \mathbb{P}}$ of $\mathbb{N}^+$-elements. We write "$\le$" for the
component-wise extension of the natural ordering on $\mathbb{N}^+$ to cuts within the same run. A cut is finite
if all its components are.

Say that an event $e$ performed by process $i$ is in a cut $(r,c)$ if $e$ occurs in $r_i$ at an index no larger
than $c_i$, and $e$ occurs outside of $(r,c)$ if the index is larger than $c_i$. A cut $(r,c)$ corresponds to
the, possibly implausible, situation in which the events in the cut have occurred for each process
$i \in \mathbb{P}$. We define the channel $chan_{i \to j}$ at a cut $(r,c)$ to be the set of $i$'s send events to
$j$ and $j$'s receive events from $i$ in $(r,c)$ that are not matched by $\delta_r$ to any event also in $(r,c)$.
Finally, a formula $\varphi$ holds at $(r,c)$, and we write $(r,c) \models \varphi$, if $\varphi$ holds in
standard propositional logic when, for each $i \in \mathbb{P}$, program variables in $Var_i$ are evaluated in the
local states $r_i(c_i)$ if $c_i$ is finite, and are considered unspecified otherwise.⁵

Observe that a cut can, in general, be fairly arbitrary. In particular, there is no requirement that all
messages that are received before a cut is reached were sent before the cut. This is deliberate. There are, of
course, many instances in which more structured cuts may be of interest. Indeed, we can define a cut $(r,c)$
to be consistent if every $\to$-predecessor of an event in the cut $(r,c)$ is also in the cut. Moreover, in this
work we make use of a stronger property of cuts: that all channels are empty at the cut.
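These notions are easy to experiment with on finite examples. The following Python sketch is an illustration only; it assumes that the events of a process are numbered from 1, so that a cut component counts how many events of that process have occurred, and that `events` lists communication events only. The names `in_cut`, `channel`, `is_consistent`, and `all_channels_empty` are hypothetical.

```python
import math

def in_cut(event, cut):
    """An event (process, index) is in the cut if its index is at most the cut component."""
    proc, index = event
    return index <= cut[proc]

def channel(events, delta, cut, i, j):
    """Sends from i to j and receives by j from i in the cut whose partner is not in the cut.

    events: dict (process, index) -> (kind, src, dst) with kind "snd" or "rcv"
    delta:  dict receive event -> matching send event
    """
    partner = dict(delta)
    partner.update({snd: rcv for rcv, snd in delta.items()})
    return {e for e, (kind, src, dst) in events.items()
            if (src, dst) == (i, j) and in_cut(e, cut)
            and not (e in partner and in_cut(partner[e], cut))}

def is_consistent(delta, cut):
    """Cuts are closed under each process order by construction, so it suffices
    that every receive in the cut has its matching send in the cut."""
    return all(in_cut(snd, cut) for rcv, snd in delta.items() if in_cut(rcv, cut))

def all_channels_empty(events, delta, cut):
    chans = {(src, dst) for (_, src, dst) in events.values()}
    return all(not channel(events, delta, cut, i, j) for i, j in chans)

# One message from process 1 to process 2.
events = {(1, 1): ("snd", 1, 2), (2, 1): ("rcv", 1, 2)}
delta = {(2, 1): (1, 1)}
print(channel(events, delta, {1: 1, 2: 0}, 1, 2))   # message in transit
print(is_consistent(delta, {1: 0, 2: 1}))           # received but not yet sent
print(all_channels_empty(events, delta, {1: math.inf, 2: math.inf}))
```

The cut `{1: 0, 2: 1}` is one of the implausible cuts mentioned above: it contains the receive but not the send, so it is not consistent and its channel holds the unmatched receive.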

2.2.2. Semantics of Programs 

We define the meaning of programs by stating when a program occurs over an interval. An interval consists
of two cuts $(r,c)$ and $(r,d)$ over the same run with $c \le d$, which we denote for simplicity by $r[c,d]$. An
event is in $r[c,d]$ if it is in $(r,d)$ but not in $(r,c)$. We define the occurrence relation $\Vdash$ between
intervals and programs by induction on the structure of programs. The interesting cases are those of $*$ and ¦.
Formally, program $P \in Prg$ occurs over interval $r[c,d]$, denoted $r[c,d] \Vdash P$, iff:⁶

$r[c,d] \Vdash \varepsilon$ if $c = d$.

$r[c,d] \Vdash x := e$ if $d = c[i \mapsto c_i + 1]$ and $r_i(d_i) = r_i(c_i)[x \mapsto v]$, where $v$ is the
value of $e$ in $r_i(c_i)$.

$r[c,d] \Vdash \mathrm{SND}^{i \to j}_e$ if $d = c[i \mapsto c_i + 1]$ and
$r_i(d_i) = r_i(c_i)[h_i \mapsto r_i(c_i)(h_i) \cdot \langle (i \to j, v) \rangle]$, where $v$ is the value of
$e$ in $r_i(c_i)$.

$r[c,d] \Vdash \mathrm{RCV}^{i \to j}_x$ if $d = c[j \mapsto c_j + 1]$ and
$r_j(d_j) = r_j(c_j)[h_j \mapsto r_j(c_j)(h_j) \cdot \langle (j \leftarrow i, v) \rangle, x \mapsto v]$.

$r[c,d] \Vdash [\varphi]$ if $c = d$ and $(r,c) \models \varphi$.

$r[c,d] \Vdash ¦$ if $c = d$ and no communication event in $(r,c)$ is matched by $\delta_r$ with an event
outside $(r,c)$.⁷

$r[c,d] \Vdash P * Q$ if there exists $c'$ satisfying $c \le c' \le d$ such that $r[c,c'] \Vdash P$ and
$r[c',d] \Vdash Q$.

$r[c,d] \Vdash P + Q$ if $r[c,d] \Vdash P$ or $r[c,d] \Vdash Q$.

$r[c,d] \Vdash P^\omega$ if, intuitively, an infinite or finite number (possibly zero) of iterations of $P$ occur
over $r[c,d]$. More formally, $r[c,d] \Vdash P^\omega$ if there exists a finite or infinite sequence
$(c^{(k)})_{k \in I}$ such that $I$ is a non-void prefix of $\mathbb{N}^+$, $c^{(0)} = c$,
$c^{(k)} \le c^{(k')}$ for all $k < k' \in I$, $\bigsqcup_{k \in I} c^{(k)} = d$, and
$r[c^{(k)}, c^{(k+1)}] \Vdash P$ for all $k, k+1 \in I$.

5 Recall that local states assign values to local variables.

6 We shall denote by $f[a \mapsto b]$ the function that agrees with $f$ on everything but $a$, and maps $a$ to $b$.

The program semantics is insensitive to deadlocks because deadlocking executions are not represented by
runs. We deliberately chose to ignore deadlocks to simplify the presentation and focus on the main aspects
of composition. Whether a program deadlocks can be analyzed using standard techniques [Lyn96, p. 635f].



General assumption. From now onward, we shall only consider programs that are deadlock-free.

2.2.3. Refinement

We shall capture various assumptions about properties of systems by specifying sets of runs. For instance, 
Rel is the class of runs with reliable communication, and RelFi is its subclass in which channels are also 
FIFO. 

Given a set $\Gamma$ of runs, we say that $P$ refines $Q$ in $\Gamma$, denoted $P \le_\Gamma Q$, iff
$r[c,d] \Vdash P$ implies $r[c,d] \Vdash Q$, for all $r \in \Gamma$ and $c, d \in (\mathbb{N}^+)^{\mathbb{P}}$.
In other words, every execution of $P$ (in a $\Gamma$ run) is also one of $Q$, regardless
of what happens before and after. Therefore, we may replace $Q$ by $P$ in any larger program context. This
definition of refinement is thus appropriate for stepwise top-down development of programs from
specifications. The refinement relation on programs is transitive (in fact a pre-order) and all programming
constructs are monotone w.r.t. the refinement order.



3. Capturing Safe Composition 

The silent cut program ¦ allows us to delineate the interactions that a layer can have with other parts of the
program. When combined with refinement it is useful for defining various notions central to the study of
safe composition, as we now illustrate.

CCL. We can express that the program $L$ is a CCL in the program $P * L * Q$ w.r.t. $\Gamma$ by:

$¦P * L * Q¦ \le_\Gamma P ¦ L ¦ Q$.

In words, any isolated execution of $P * L * Q$ will have the property that all communication in $L$ is internal
and hence $L$ executes as in isolation. This definition is context-sensitive.

7 I.e., no receive in the cut $(r,c)$ is mapped by $\delta_r$ to a send outside of the cut, and no receive from
outside is mapped to a send in the cut.



Barriers. More modular would be a notion that guarantees safe composition regardless of the program
context. One technique to ensure that two consecutive layers do not interfere with each other is to place a
barrier $B$ between them. Formally, program $B$ is a barrier in $\Gamma$ if

$¦P * B * Q \le_\Gamma P ¦ B ¦ Q$, for all $P$, $Q$.

Traditionally, barriers have been used to synchronize the progression through phases by enforcing that no
process could start its $(n+1)$st task before all the other processes had completed their $n$th tasks. This could
be formalized by requiring that, if $r[c,c'] \Vdash ¦P$, $r[c',d'] \Vdash B$, and $r[d',d] \Vdash Q$, then all
events in $(r,c')$ necessarily $\to$-precede all events not in $(r,d')$, for all runs $r \in \Gamma$ and
programs $P$, $Q$.

TCC. Some programs can be safely composed without the need for communication-closedness [EF82,
JZ92]. Depending on the model $\Gamma$, there may be programs $P$ that safely compose with all following layers.
We say that $P$ is tail communication closed (TCC) in $\Gamma$ if

$¦P \le_\Gamma P¦$.

Thus, if $P$ is TCC then any execution of $P$ starting in empty channels will also end with all channels empty.
Therefore TCC programs can be readily composed.⁸ It is straightforward to check that the programs
$\varepsilon$, $[\varphi]$, $x := e$, and $P¦$ are TCC in any $\Gamma$. Moreover, if $P$ and $Q$ are TCC in
$\Gamma$ then so are $P + Q$, $P * Q$, and $P^\omega$. Observe that every barrier $B$ in $\Gamma$ is in
particular TCC in $\Gamma$.



Seals. In many models of interest, only trivial programs are TCC. This is the case, for example, in Rel,
as shown in Section 4 below. In such models, an alternative methodology is required for determining when
it is safe to compose given programs. Next we define a notion of sealing that formalizes the concept of
program $S$ serving as an impermeable layer between $P$ and later phases such that no later communication
will interact with $P$. We say that $S$ seals $P$ in $\Gamma$ if

$¦P * S \le_\Gamma P ¦ S$.

Thus, if $S$ seals $P$ in $\Gamma$ then neither $S$ nor any later program can interfere with communication in
$P$. If $S$ seals $P$ and $Q$ seals $S$, then $S$ will behave in $¦P * S * Q$ as it does in isolation. Sealing
allows incremental program development while maintaining CCL-style composition.

Lemma 1

1. If both $P$ and $P'$ are sealed by $S$ in $\Gamma$ then so is $P + P'$.

2. If both $S$ and $S'$ seal $P$ in $\Gamma$ then $S + S'$ (properly) seals $P$ in $\Gamma$.

3. If $S$ seals $P$ in $\Gamma$ then $S * Q$ seals $P$ in $\Gamma$.

4. If both $S$ seals $P$ and $S'$ seals $S$ in $\Gamma$, then $S'$ seals $P * S$ in $\Gamma$.

5. If $P$ seals itself in $\Gamma$ then $P$ seals $P^\omega$ in $\Gamma$.

6. TCC subsumes sealing: $P$ is TCC in $\Gamma$ iff all programs seal $P$ in $\Gamma$.

It follows from this lemma that, if program $P$ can be decomposed into a sequence of $\ell$ layers
$L^{(1)}, \dots, L^{(\ell)}$, and in addition $L^{(k+1)}$ seals $L^{(k)}$ for all $1 \le k < \ell$, then each
layer $L^{(k)}$ is a CCL in $P$.

For example, as discussed in the introduction, any program of the form $\mathrm{MT}^{j \to i}$ seals any program
of the form $\mathrm{MT}^{i \to j}$ in Rel. Consequently, a program of the form
$\mathrm{MT}^{i \to j} * \mathrm{MT}^{j \to i}$ seals itself in Rel. On the other hand, the shorter program
$\mathrm{MT}^{i \to j}$ does not seal itself in Rel: in an execution of
$\mathrm{MT}^{i \to j} * \mathrm{MT}^{i \to j}$ the two messages sent by $i$ could be received in the reverse
order of sending.

8 TCC follows and is closely related to the notion of GTCC introduced by Gerth and Shrira [GS86]. The main
difference is that their notion is defined w.r.t. a set of initial states.

Proper Seals. Suppose that $\mathbb{P} = \{1, 2\}$ and $x_i \in Var_i$ for $i \in \mathbb{P}$. Then the program
$Q$ = while true do ($x_1 := 5 * x_2 := 17$) od is TCC in RelFi, a CCL in Rel, and seals any program in Rel. For
it necessarily diverges, that is, it occurs only over intervals $r[c,d]$ with non-finite $d$. This implies that
no layer following $Q$ has any impact on the semantics of the whole program. It follows trivially that no
communication of a later layer can interfere with anything before. Programs such as $Q$ are not particularly
useful as seals, in contrast to ones that seal without diverging. This motivates the following definition. We
say that $S$ is a proper seal of $P$ in $\Gamma$ if $S$ seals $P$ and $S$ never diverges after $P$. That is, for
all $r \in \Gamma$ and $c, d, d'$, whenever $r[c,d] \Vdash ¦P$ and $r[d,d'] \Vdash ¦S$ and $d$ is finite then so
is $d'$.

For instance, since $\mathrm{MT}^{i \to j}$ is a terminating program that seals $\mathrm{MT}^{j \to i}$ in Rel,
it is in particular a proper seal.

4. Case Study: Safe Composition in Rel 

We now consider safe composition in the model Rel. Communication events can cause a program not to be
TCC in Rel. For example, reconsider the program
$\mathrm{MT}^{i \to j} = \mathrm{SND}^{i \to j} * \mathrm{RCV}^{i \to j}$. It is TCC in RelFi but not
TCC in Rel. That $\mathrm{MT}^{i \to j}$ is not TCC in Rel is no coincidence. Next we show that no terminating
program performing any communication whatsoever is TCC in Rel.

Theorem 2 If $r[c,d] \Vdash P$ for some $r \in$ Rel and finite $c, d$ such that all channels are empty in
$(r,c)$ and there is at least one send or receive event in $r[c,d]$, then $P$ is not TCC in Rel.

Proof: Assume that $r[c,d] \Vdash P$ where $r \in$ Rel, $c, d$ are finite, all channels are empty at $(r,c)$ and
there is a send or receive event in $r[c,d]$. If there is a non-empty channel in $(r,d)$ the claim is immediate
since a matching communication event following $P$ could interact with $P$. Otherwise, every message sent in
$r[c,d]$ is received in $r[c,d]$. Since $P$ is deadlock-free by the general assumption, there are processes
whose last communication event in $r[c,d]$ is a receive. W.l.o.g. let $i$ be such a process and assume that its
last receive is of a message $v$ sent by $j$ into variable $x \in Var_i$.

A run $r' \in$ Rel that equals $r$ up to $d$ can be constructed such that
$r'[d,d'] \Vdash ¦\mathrm{SND}^{j \to i}_e * \mathrm{RCV}^{j \to i}_x¦$, where $e$ evaluates to $v$ in
$r_j(d_j)$. So the same message is transmitted twice between $j$ and $i$. Let $r'' \in$ Rel be
the same as $r'$, except for the matching function $\delta_{r''}$, which swaps the matching send events between
the two receive events. For $Q = \mathrm{SND}^{j \to i}_e * \mathrm{RCV}^{j \to i}_x$ it follows that
$r''[c,d'] \Vdash ¦P * Q¦$ but $r''[c,d'] \nVdash ¦P ¦ Q¦$. The claim follows. ∎

Since a barrier is necessarily TCC we immediately obtain 

Corollary 3 No program can serve as a barrier in Rel. 

Having shown that TCC and thus barriers are not generally useful notions in Rel, we turn our attention to 
(proper) sealing. It is instructive that not all terminating programs can be properly sealed in Rel: 

Lemma 4 If $\mathbb{P} = \{1, 2\}$ then the program
$X = \mathrm{SND}^{1 \to 2} * \mathrm{SND}^{2 \to 1} * \mathrm{RCV}^{1 \to 2} * \mathrm{RCV}^{2 \to 1}$
illustrated in Fig. 3(a) cannot be sealed properly in Rel.

Proof: Assume, by way of contradiction, that $S$ properly seals $X$ in Rel. Consider a run $r \in$ Rel such
that $r[(0,0),(2,2)] \Vdash X$ and $r[(2,2),d'] \Vdash S$ where $d'$ is finite. If some process
$i \in \mathbb{P}$ does not engage in any communication event in $r[(2,2),d']$ then $S$ does not seal $X$ since
a send by process $i$ performed at $d'_i$ potentially interacts with $X$. Otherwise, let $e_i$ be the first
communication event of each process $i = 1, 2$ in $r[(2,2),d']$. If one of the $e_i$ is a send then, as before,
this send can interact with $X$. Finally, if both $e_i$ are receives then $S$ causes a deadlock, contradicting
the assumption that $r[(2,2),d'] \Vdash S$. ∎

Our programming language Prg is Turing-complete. Since the halting problem for Prg can be reduced to
sealability in Rel we obtain

Theorem 5 Sealability in Rel is undecidable.

Given this theorem we shall restrict our attention to more tractable subclasses of programs. Program $P$ is
balanced (in Rel) if, whenever $r[c,d] \Vdash P$ and all channels are empty at $(r,c)$, then every channel
contains an equal number of sends and receives at $(r,d)$. Note that balanced programs are TCC in RelFi. The
following theorem shows that in Rel balance is a necessary prerequisite for being properly sealable.

Theorem 6 In Rel, every non-divergent program that is properly sealable is also balanced.

Proof: Let $P$ and $S$ be programs such that in Rel $P$ does not diverge and $S$ properly seals $P$. Assume by
way of contradiction that $P$ is not balanced. Let $r \in$ Rel and $c, c', d$ be such that $r[c,c'] \Vdash P$,
$r[c',d] \Vdash S$, all channels are empty in $(r,c)$, and, w.l.o.g., $chan_{i \to j}$ contains $k$ sends and
$m$ receives at $(r,c')$ where $k \ne m$. Since $S$ is a proper seal, there is neither a send nor a receive event
in $r[c,c']$ matched with an event not in $r[c,c']$. Since every receive event must be matched to some event by
$\delta_r$ it follows that $k > m$, that is, there are more sends than receives on $chan_{i \to j}$ in
$r[c,c']$. No receive in the seal can be matched to any of those sends. There exist $r' \in$ Rel,
$y \in Var_j$, and $d'$ such that $r'$ is the same as $r$ up to $d$ (hence $r'[c,d] \Vdash P * S$),
$r'[d,d'] \Vdash \mathrm{RCV}^{i \to j}_y$, and $\delta_{r'}$ maps this receive event to one of the send events
of $P$ that are unmatched in $r$. This match contradicts the assumption that $S$ properly seals $P$. ∎

Program $P$ is said to close $\mathit{chan}_{i \to j}$ (in Rel) if $\mathit{chan}_{i \to j}$ is empty after $P$ in any
execution of $P$ starting at a cut with empty channels. More formally: for all $r \in$ Rel, if
$r[c,d] \Vdash \lceil P$ then $\mathit{chan}_{i \to j}$ is empty in $(r,d)$. A channel that is not closed is open. The state of a
program's channels is the essential element in determining sealability.

Program P is straight-line if it contains neither nondeterministic choices nor loops nor guards. In other 
words, P is built from sends, receives, and assignments using layering only. Our focus in this section is on 
balanced straight-line programs, or BSL for short. 

The program graph of a BSL $P$ is a graph $(V,E)$ that has a node for every send and receive event in $P$ plus
an initial dummy node $\mathrm{FST}_i$ and a final dummy node $\mathrm{LST}_i$ for each process $i$. The edge set $E$
consists of the successor relation over events in the same process extended to the dummy nodes plus an edge
between the $k$'th send and the $k$'th receive on channel $\mathit{chan}_{i \to j}$, for all $k$, $i$, and $j$. All the graphs
in Figures 2 and 3 are program graphs. The size of a BSL $P$'s program graph is of the order of the size of the
program.
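The construction just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's implementation: the encoding of a program as a left-to-right list of `("SND", i, j)` and `("RCV", j, i)` steps, the tuple-shaped node names, and the function name `program_graph` are all assumptions made for the example.

```python
from collections import defaultdict

def program_graph(program, processes):
    """Build (V, E) for a balanced straight-line program.

    Assumed encoding: ("SND", i, j) is process i sending on chan_{i->j};
    ("RCV", j, i) is process j receiving on that same channel.
    """
    nodes = {(d, p) for p in processes for d in ("FST", "LST")}
    edges = set()
    last = {p: ("FST", p) for p in processes}   # latest node of each process
    seen = defaultdict(int)                     # occurrences per (kind, proc, peer)
    for kind, proc, peer in program:
        seen[kind, proc, peer] += 1
        node = (kind, proc, peer, seen[kind, proc, peer])
        nodes.add(node)
        edges.add((last[proc], node))           # local successor edge
        last[proc] = node
    for p in processes:
        edges.add((last[p], ("LST", p)))        # close each process with LST
    for node in list(nodes):
        if node[0] == "SND":
            _, i, j, k = node
            match = ("RCV", j, i, k)            # k'th receive on chan_{i->j}
            if match in nodes:
                edges.add((node, match))
    return nodes, edges

# A single message transmission from process 1 to process 2.
nodes, edges = program_graph([("SND", 1, 2), ("RCV", 2, 1)], {1, 2})
print(len(nodes), len(edges))
```

For the single transmission above the graph has four dummy nodes, one send, and one receive; the only edge between the two processes links the send to its receive.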

Next we investigate the connection between program graphs and Lamport causality. We use $E^+$ to refer
to the irreflexive transitive closure of $E$ and call edges not containing dummy nodes normal. The subset of
normal edges is denoted by $N_E$. In RelFi, the normal edges induce the full causality relation on the events
of the program. As we shall show, in Rel the normal edges of a program graph are also $\to$ edges.

Lemma 7 Let $r \in$ Rel. Let $P$ be a BSL with program graph $(V,E)$. If $r[c,d] \Vdash \lceil P$ then $N_E \subseteq {\to}$.

Proof: The only interesting normal edges are those between sends and receives of different processes.
Consider the edge $(e_1, e_2) \in E$ between the $k$'th send and the $k$'th receive on $\mathit{chan}_{i \to j}$. Let
$r \in$ Rel such that $r[c,d] \Vdash \lceil P$ and assume that $e_3 = \delta_r(e_2)$ is the $\ell$'th send on $\mathit{chan}_{i \to j}$
in $P$. We need to show that $e_1 \to e_2$. By definition of $\to$, we have that $e_3 \to e_2$. If $\ell = k$ then
$e_3 = e_1$ and we are done. If $\ell > k$ then $e_1 \to e_3$ because $e_1$ is an earlier event of $i$ than $e_3$, and the
claim follows by transitivity of $\to$. Finally, suppose that $\ell < k$. This case is illustrated in Fig. 4.
Consider the $k - 1$ receives on $\mathit{chan}_{i \to j}$ that precede $e_2$. They are all matched in $r$ to sends by $i$.
Since $\ell < k$ and $e_3$ is already matched to $e_2$, one of these receives, say $e_4$, must be matched to a send
event $e_5$ that does not precede $e_1$, so either $e_5 = e_1$ or $e_1 \to e_5$. Since $e_5 \to e_4$ and $e_4 \to e_2$, it
follows that $e_1 \to e_2$, as desired. ■

Figure 4: The case $\ell < k$ in the proof of Lemma 7.



Lemma 7 implies that all edges in $(N_E)^+$ will be $\to$ edges in every run $r \in$ Rel of $\lceil P * Q$. We note that
$(N_E)^+$ is the largest set of edges with this property, because $(N_E)^+ = ({\to} \cap V^2)$ if $r \in$ RelFi.

A more concise representation than the program graph is called the signature of $P$ and denoted by $\mathrm{Sig}(P)$.
It has size $O(n^2)$ while preserving the information necessary to decide which channels are left open or
closed by $P$. Given the program graph $(V,E)$ of a BSL $P$ we can obtain $\mathrm{Sig}(P)$ as follows. After
calculating $E^+$, we remove all nodes except for the dummy nodes and the first send and last receive on each
channel. The graph is further reduced by removing the node $\mathrm{SND}^{i \to j}$ whenever
$(\mathrm{FST}_j, \mathrm{SND}^{i \to j}) \in E^+$. Similarly, $\mathrm{RCV}^{j \leftarrow i}$ is removed whenever
$(\mathrm{RCV}^{j \leftarrow i}, \mathrm{LST}_i) \in E^+$. The sends and receives remaining in the signature
are precisely the ones that could interfere with receives in a preceding layer or with sends in a succeeding
layer.

The complexity of computing $\mathrm{Sig}(P)$ is in $O(\|P\|^3)$ since it requires the causality relation obtained as
the transitive closure of the edge relation of $P$'s program graph. We remark that for BSLs $P$ and $Q$,
$\mathrm{Sig}(P * Q)$ can be obtained from their respective signatures at a cost of $O(n^2)$.
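As a rough illustration of the pruning rules, the following Python sketch computes a signature from a hand-written program graph of $\mathrm{MT}^{1 \to 2} * \mathrm{MT}^{2 \to 1}$. The node encoding `(kind, proc, peer, k)`, with `k` the occurrence index on the channel, and the helper names `transitive_closure` and `signature` are assumptions of this example rather than notation from the paper; the closure is computed naively and assumes an acyclic graph.

```python
def transitive_closure(edges):
    """Irreflexive transitive closure E+ of an acyclic edge set."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
    closure = set()
    for start in succ:
        stack, seen = list(succ[start]), set()
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(succ.get(n, ()))
        closure |= {(start, n) for n in seen}
    return closure

def signature(nodes, edges):
    """Keep dummy nodes, first sends, and last receives that may still interfere."""
    closure = transitive_closure(edges)
    keep = set()
    for node in nodes:
        if node[0] in ("FST", "LST"):
            keep.add(node)
            continue
        kind, proc, peer, k = node
        if kind == "SND":
            # first send on chan_{proc->peer}, not preceded by the receiver's FST
            if k == 1 and (("FST", peer), node) not in closure:
                keep.add(node)
        else:
            # last receive on chan_{peer->proc}, not preceding the sender's LST
            is_last = ("RCV", proc, peer, k + 1) not in nodes
            if is_last and (node, ("LST", peer)) not in closure:
                keep.add(node)
    return keep, {(a, b) for a, b in closure if a in keep and b in keep}

# Hand-written program graph of MT^{1->2} * MT^{2->1}.
F1, L1, F2, L2 = ("FST", 1), ("LST", 1), ("FST", 2), ("LST", 2)
S12, R21 = ("SND", 1, 2, 1), ("RCV", 2, 1, 1)   # MT^{1->2}
S21, R12 = ("SND", 2, 1, 1), ("RCV", 1, 2, 1)   # MT^{2->1}
NODES = {F1, L1, F2, L2, S12, R21, S21, R12}
EDGES = {(F1, S12), (S12, R12), (R12, L1),       # process 1
         (F2, R21), (R21, S21), (S21, L2),       # process 2
         (S12, R21), (S21, R12)}                 # matching send/receive pairs
sig_nodes, sig_edges = signature(NODES, EDGES)
print(sorted(n for n in sig_nodes if n[0] in ("SND", "RCV")))
```

In this example the receive of the first transmission and the send of the reply are pruned, so only the first send and the final receive remain next to the dummy nodes.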

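Let $P$ be a BSL and let $G = (V,E)$ be $\mathrm{Sig}(P)$. Then $P$ leaves channel $\mathit{chan}_{i \to j}$ open iff
$\mathrm{RCV}^{j \leftarrow i} \in V$. For instance, the program $\mathrm{MT}^{i \to j}$ leaves $\mathit{chan}_{i \to j}$ open:
there is a node $\mathrm{RCV}^{j \leftarrow i}$ in $\mathrm{Sig}(\mathrm{MT}^{i \to j})$, which is depicted in Fig. 5(a). As we
have shown earlier, $\mathrm{MT}^{j \to i}$ seals $\mathrm{MT}^{i \to j}$ in Rel, which implies that $\mathrm{MT}^{j \to i}$
closes $\mathit{chan}_{i \to j}$ once. Since $\mathrm{MT}^{j \to i}$ does not re-open the channel, the
$\mathrm{RCV}^{j \leftarrow i}$ node found in $\mathrm{Sig}(\mathrm{MT}^{i \to j})$ is not present in the
$\mathrm{Sig}(\mathrm{MT}^{i \to j} * \mathrm{MT}^{j \to i})$ shown in Fig. 5(b).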




(a) $\mathrm{Sig}(\mathrm{MT}^{i \to j})$ (b) $\mathrm{Sig}(\mathrm{MT}^{i \to j} * \mathrm{MT}^{j \to i})$



Figure 5: Examples of signatures. Thin arrows denote transitive causality edges. 



4.1. Deciding Sealing 

Whether one BSL seals another can be decided on the basis of their signatures. Suppose BSL $P$ leaves
$\mathit{chan}_{i \to j}$ open and $Q$ seals $P$. Then, if $Q$ sends on that channel, $P$'s last receive
$\mathrm{RCV}^{j \leftarrow i}$ on the channel must causally precede $Q$'s first send $\mathrm{SND}^{i \to j}$ on it.
Otherwise, $Q$ must ensure that any later send on $\mathit{chan}_{i \to j}$ is causally preceded by $P$'s last receive.
This is guaranteed exactly if $P$'s signature contains an edge $(\mathrm{RCV}^{j \leftarrow i}, \mathrm{LST}_k)$ and
$Q$'s signature contains an edge $(\mathrm{FST}_k, \mathrm{LST}_i)$, for some $k \in \mathbb{P}$. (See Fig. 6.)
Based on the above observation the following theorem characterizes sealing among BSLs.

(a) Channel $\mathit{chan}_{i \to j}$ left open by $P$ and a causality edge to $\mathrm{LST}_k$.
(b) Sealing the channel by causality. The dashed part accounts for $Q$ sending on $\mathit{chan}_{i \to j}$.

Figure 6: Excerpts of the signatures of BSLs $P$ and $Q$.



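Theorem 8 Let $P$ and $Q$ be BSLs and let $(V_P, E_P) = \mathrm{Sig}(P)$ and $(V_Q, E_Q) = \mathrm{Sig}(Q)$. Then $Q$
properly seals $P$ iff, for all $\mathrm{RCV}^{j \leftarrow i} \in V_P$, there exists $k \in \mathbb{P}$ such that
$(\mathrm{RCV}^{j \leftarrow i}, \mathrm{LST}_k) \in E_P$, $(\mathrm{FST}_k, \mathrm{LST}_i) \in E_Q$, and, if
$\mathrm{SND}^{i \to j} \in V_Q$ then $(\mathrm{FST}_k, \mathrm{SND}^{i \to j}) \in E_Q$.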

Proof: "$\Leftarrow$" Consider the channel $\mathit{chan}_{i \to j}$. By construction, there is a node
$\mathrm{RCV}^{j \leftarrow i} \in V_P$ precisely if the channel is not closed by $P$. Suppose that
$(\mathrm{RCV}^{j \leftarrow i}, \mathrm{LST}_k) \in E_P$ and $(\mathrm{FST}_k, e) \in E_Q$ where $e = \mathrm{SND}^{i \to j}$ if
$\mathrm{SND}^{i \to j} \in V_Q$ and $e = \mathrm{LST}_i$ otherwise. Let $r \in$ Rel and $c, d$ be such that
$r[c,d] \Vdash \lceil P * Q$. Let $c'$ be such that $r[c,c'] \Vdash P$ and $r[c',d] \Vdash Q$. Let $e_R$ in $r[c,c']$ be a
$\mathrm{RCV}^{j \leftarrow i}$ event. We shall prove that $e_S = \delta_r(e_R)$ is also in $r[c,c']$. By definition of
$\to$ we have that $e_S \to e_R$. First observe that $e_S$ cannot be in $(r,c)$ since $r[c,d] \Vdash \lceil P * Q$
implies that $\delta_r$ cannot map $e_R$ to an event in $(r,c)$. Second, $e_S$ cannot come after $(r,c')$ because, as
we shall show, that would imply $e_R \to e_S$. By transitivity, we would obtain $e_R \to e_R$, contradicting the
irreflexivity of $\to$. Assume by way of contradiction that $e_S$ is not in $(r,c')$. If $e_S$ is in $r[c',d]$, that
is, generated by $Q$, then $\mathrm{SND}^{i \to j} \in V_Q$ represents a send event $e'_S$. This event is causally
preceded by $e_R$ because $(e_R, \mathrm{LST}_k) \in E_P$, $(\mathrm{FST}_k, \mathrm{SND}^{i \to j}) \in E_Q$, and
$e_S = e'_S$ or $e'_S \to e_S$. Otherwise, that is, if $e_S$ is not in $(r,d)$, it is causally preceded by $e_R$
because $(e_R, \mathrm{LST}_k) \in E_P$, $(\mathrm{FST}_k, \mathrm{LST}_i) \in E_Q$, and $e_S$ is causally preceded by the
last event of process $i$ in $r[c,d]$. In either case, $e_R \to e_S$ follows by transitivity.

By now we have shown that $\delta_r$ does not match any receive in $r[c,c']$ to a send event not in $r[c,c']$.
Because $P$ is balanced this implies that all send events in $r[c,c']$ (i.e., the ones generated by $P$) must be
matched with receive events in that interval. Thus, also $r[c,d] \Vdash \lceil P \rceil Q$.

"$\Rightarrow$" Suppose that $\mathrm{RCV}^{j \leftarrow i} \in V_P$ and that there is no $k$ such that
$(\mathrm{RCV}^{j \leftarrow i}, \mathrm{LST}_k) \in E_P$ and $(\mathrm{FST}_k, e) \in E_Q$ where $e = \mathrm{SND}^{i \to j}$
if $\mathrm{SND}^{i \to j} \in V_Q$ and $e = \mathrm{LST}_i$ otherwise. We show that $Q$ does not properly seal $P$. First
consider the case $e = \mathrm{SND}^{i \to j}$. For lack of a causal relationship between
$\mathrm{RCV}^{j \leftarrow i} \in V_P$ and $\mathrm{SND}^{i \to j} \in V_Q$ they can be matched in an interval $r[c,d]$ over
which $\lceil P * Q$ occurs, violating the sealing property. Finally consider the remaining case,
$e = \mathrm{LST}_i$. Again, for lack of a causal relationship between $\mathrm{RCV}^{j \leftarrow i} \in V_P$ and
$\mathrm{LST}_i \in V_Q$, a subsequent send event can be matched with $\mathrm{RCV}^{j \leftarrow i} \in V_P$, that is,
there exist $r \in$ Rel and $c, d$ such that $r[c,d] \Vdash \lceil P * Q * \mathrm{SND}^{i \to j}$ and $\delta_r$ matches the
last receive on channel $\mathit{chan}_{i \to j}$ in $P$ with the $\mathrm{SND}^{i \to j}$. ■

Given the theorem above, the complexity of deciding whether $Q$ seals $P$, given their signatures, is
determined by the size of $P$'s signature, which we recall is $O(n^2)$.
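The condition of Theorem 8 can be checked mechanically. The Python sketch below is one possible reading of it, under stated assumptions: signatures are given as `(nodes, edges)` pairs with transitively closed edge sets, signature nodes are encoded as `("FST", k)`, `("LST", k)`, `("SND", i, j)`, and `("RCV", j, i)`, and the helper `mt_signature` writes out the signature of a single message transmission by hand. None of these names come from the paper.

```python
def is_seal(sig_p, sig_q, processes):
    """Check the condition of Theorem 8 on two signatures."""
    (vp, ep), (vq, eq) = sig_p, sig_q
    for node in vp:
        if node[0] != "RCV":
            continue
        _, j, i = node                        # chan_{i->j} is left open by P
        snd = ("SND", i, j)
        target = snd if snd in vq else ("LST", i)
        # some process k must relay causality from P's last receive to Q
        if not any((node, ("LST", k)) in ep and (("FST", k), target) in eq
                   for k in processes if k != i):
            return False
    return True

def mt_signature(i, j):
    """Hand-written signature of MT^{i->j} = SND^{i->j} * RCV^{j<-i}."""
    snd, rcv = ("SND", i, j), ("RCV", j, i)
    fst_i, lst_i, fst_j, lst_j = ("FST", i), ("LST", i), ("FST", j), ("LST", j)
    nodes = {snd, rcv, fst_i, lst_i, fst_j, lst_j}
    edges = {(fst_i, snd), (snd, lst_i), (fst_i, lst_i),
             (fst_j, rcv), (rcv, lst_j), (fst_j, lst_j),
             (snd, rcv), (fst_i, rcv), (snd, lst_j), (fst_i, lst_j)}
    return nodes, edges

PROCS = {1, 2}
# A reply MT^{2->1} seals MT^{1->2}.
print(is_seal(mt_signature(1, 2), mt_signature(2, 1), PROCS))
# Repeating MT^{1->2} does not: the second send is not causally after the first receive.
print(is_seal(mt_signature(1, 2), mt_signature(1, 2), PROCS))
```

The first call reflects the earlier observation that $\mathrm{MT}^{j \to i}$ seals $\mathrm{MT}^{i \to j}$; the second shows the non-FIFO hazard of a second message overtaking the first.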

4.2. A Characterization of Sealability 

Observe that the set of channels closed by a BSL $P$ when executed from a cut with empty channels is
uniquely determined by $P$ and can be derived from its signature. We can thus associate a closed-channel
graph with each BSL. Formally, the closed-channel graph $C_P = (\mathbb{P}, E_P)$ of a BSL $P$ is given by
$(i,j) \in E_P$ iff $i \neq j$ and $\mathit{chan}_{i \to j}$ is closed by $P$ in Rel. In the following we denote the
undirected version of a graph $G$ by $G^u$.

Theorem 9 (Sealability) Let $P$ be a BSL. Then $P$ can be sealed properly in Rel iff $C_P$ is connected. Moreover,
if $P$ is properly sealable in Rel then it can be sealed by a BSL that transmits fewer than $3n$ messages.

Proof: "$\Rightarrow$" Suppose that $C_P$ is not connected. Then $\mathbb{P}$ can be partitioned into two non-void
sets, $A$ and $\bar{A}$, such that there is no channel closed by $P$ between (elements of) the two sets. Assume,
by way of contradiction, that the program $S$ properly seals $P$. Since $S$ is a seal, every message sent in $S$
along a channel not closed by $P$ must be causally preceded by all receives on that channel in $P$. This holds in
particular for all channels between $A$ and $\bar{A}$. There must be such receives in $P$ for each of the channels
not closed by $P$. To establish the causal precedences, $S$ must transmit messages. Unless $S$ transmits messages
between $A$ and $\bar{A}$, it cannot seal $P$. Consider one of the causally minimal sends of such a transmission
in $S$. It can interfere with the last receive on that channel in $P$. Consequently, $S$ does not seal $P$.

"$\Leftarrow$" The algorithm sketched as Seal($P$) takes a BSL $P$ as input and outputs a proper seal for $P$ if
$P$ is properly sealable.

Seal($P$)

1  $(V,E) \leftarrow$ Closed-Channels($P$)    ▷ This algorithm is presented in Appendix A
2  $S \leftarrow \varepsilon$
3  pick $T \subseteq E$ s.t. $(\mathbb{P}, T)^u$ forms a spanning tree of $V$
4  $v \leftarrow$ a node at the center of $T$
5  for $(w,w') \in T$ pointing away from $v$ s.t. $(w',w) \notin E$
6      do $S \leftarrow S * \mathrm{MT}^{w \to w'}$
7  add a converge-cast in $T$ to $S$
8  add a broadcast in $T$ to $S$

Let $S$ be the result of Seal($P$). It consists of fewer than $3n$ instances of MT because every spanning tree
contains $n - 1$ edges and each of the three sub-phases, (a) lines 5-6, (b) the converge-cast, and (c) the
broadcast, transmits fewer than $n$ messages. Each one of these MT instances transmits a message along a
channel that is closed at the time of transmission. For phase (a) this follows from the selection criterion
for these transmissions in line 5. Phase (a) establishes that all channels between a node and its parent in the
spanning tree are closed, thus phase (b) transmits on closed channels only. Similarly, phase (b) closes all
channels between nodes and their children in the spanning tree, hence also phase (c) transmits on closed
channels only. Finally, we need to show that every channel left open by $P$ is closed at least once by $S$. Let
$(i,j)$ be such that $P$ leaves $\mathit{chan}_{i \to j}$ open. If $(i,j) \in T^{-1}$ then phase (a) closes the channel by
sending on $\mathit{chan}_{j \to i}$. Otherwise it is closed transitively by the subsequence of the converge-cast
from $j$ to the root $v$ followed by the subsequence of the broadcast from $v$ to $i$. ■

Observe that Seal(P) constructs a tailor-made tree barrier S between P and any later program. 
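To make the three phases of Seal($P$) concrete, here is a small Python sketch. It is a simplified reading under explicit assumptions, not the paper's algorithm verbatim: the closed-channel graph is passed in as a set `closed` of pairs `(i, j)`, the spanning tree is obtained by breadth-first search, the "center" is taken to be a node of minimum eccentricity in that tree, and the seal is returned as an ordered list of `(sender, receiver)` transmissions. The names `bfs` and `build_seal` are hypothetical.

```python
from collections import deque

def bfs(adj, root):
    """Breadth-first search returning parent map, distances, and visit order."""
    parent, dist, order = {root: None}, {root: 0}, [root]
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in sorted(adj[u]):
            if w not in parent:
                parent[w], dist[w] = u, dist[u] + 1
                order.append(w)
                queue.append(w)
    return parent, dist, order

def build_seal(processes, closed):
    """List of (sender, receiver) transmissions, or None if C_P^u is disconnected."""
    adj = {p: set() for p in processes}
    for i, j in closed:
        adj[i].add(j)
        adj[j].add(i)
    parent, _, order = bfs(adj, min(processes))
    if len(order) < len(adj):
        return None                         # no proper seal exists (Theorem 9)
    tree = {p: set() for p in processes}    # undirected spanning tree
    for child, par in parent.items():
        if par is not None:
            tree[child].add(par)
            tree[par].add(child)
    # Assumed meaning of "center": minimum eccentricity within the tree.
    center = min(sorted(processes), key=lambda p: max(bfs(tree, p)[1].values()))
    parent, _, order = bfs(tree, center)
    children = order[1:]
    # (a) close child->parent channels that P left open
    phase_a = [(parent[c], c) for c in children if (c, parent[c]) not in closed]
    # (b) converge-cast: deepest nodes first, each sending to its parent
    phase_b = [(c, parent[c]) for c in reversed(children)]
    # (c) broadcast: root first, each parent sending to its children
    phase_c = [(parent[c], c) for c in children]
    return phase_a + phase_b + phase_c

seal = build_seal({1, 2, 3, 4}, {(1, 2), (1, 3), (1, 4)})
print(len(seal), seal[:3])
```

With four processes and only the channels out of process 1 closed, the sketch produces nine transmissions, in line with the bound of fewer than $3n$ messages.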

Example 10 Consider a phase $L = *_{i \in \mathbb{P}} L_i$. In $L$ each process $i \neq 1$ sends a message to every other
process $k \notin \{1, i\}$ before receiving the $n - 2$ messages sent to it in this phase. Finally, process $i$
transmits a message to process 1. We can define process $i$'s program $L_i$ more formally by

$L_i = (*_{k \notin \{1,i\}} \mathrm{SND}^{i \to k}) * (*_{k \notin \{1,i\}} \mathrm{RCV}^{i \leftarrow k}) * \mathrm{SND}^{i \to 1}$.

Process 1 in turn receives those messages sent last in the $L_i$, that is:

$L_1 = *_{i \neq 1} \mathrm{RCV}^{1 \leftarrow i}$

Executing $L$ beginning with empty channels leaves $n^2 - 3n + 3$ channels open. Nevertheless, $L$ can be sealed
efficiently by the program

$S = *_{i \neq 1} (\mathrm{SND}^{1 \to i} * \mathrm{RCV}^{i \leftarrow 1})$,

which transmits $n - 1$ messages. (See Fig. 7 for the program graph of $S$.)

Figure 7: $n - 1$ transmissions close $\Omega(n^2)$ open channels.
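The shape of Example 10 can be checked for small $n$ with a short Python sketch. It only builds the step lists of $L$ and $S$ and counts messages; it does not compute which channels are open. The encoding of steps as `("SND", i, j)` and `("RCV", j, i)` tuples and the function names are assumptions of this sketch, and the syntactic check in `is_balanced` (equal numbers of sends and receives per channel) is used here as a stand-in for balance of a straight-line program.

```python
from collections import Counter

def phase_L(n):
    """Steps of L for processes 1..n, listed process by process."""
    steps = []
    for i in range(2, n + 1):
        others = [k for k in range(2, n + 1) if k != i]
        steps += [("SND", i, k) for k in others]      # send to every k not in {1, i}
        steps += [("RCV", i, k) for k in others]      # then receive n - 2 messages
        steps.append(("SND", i, 1))                   # finally report to process 1
    steps += [("RCV", 1, i) for i in range(2, n + 1)] # L_1
    return steps

def seal_S(n):
    """Steps of S: process 1 transmits one message to every other process."""
    steps = []
    for i in range(2, n + 1):
        steps += [("SND", 1, i), ("RCV", i, 1)]
    return steps

def is_balanced(steps):
    """Same number of sends and receives on every channel (i, j)."""
    sent = Counter((i, j) for kind, i, j in steps if kind == "SND")
    received = Counter((j, i) for kind, i, j in steps if kind == "RCV")
    return sent == received

n = 5
print(is_balanced(phase_L(n)))
print(sum(1 for kind, _, _ in seal_S(n) if kind == "SND"))
```

For $n = 5$ the seal sends four messages, matching the $n - 1$ stated in the example.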



5. Conclusion and Future Work 

A subtle yet crucial issue in developing distributed applications is the safe composition of smaller programs 
into larger ones. The notion of CCL captures when a program works as if it were executed in isolation in the 
context of a given larger program. The literature on CCLs focused mostly on reliable FIFO communication. 
In that setting programs can be designed that are inherently CCLs in any program context. 

Observe that neither termination detection nor barrier-style techniques can be applied in Rel without
careful inspection of the surrounding program context. Any such mechanism will form a layer in the
resulting program which in turn must be shown to safely compose with the other layers. A popular approach
to running distributed applications on non-RelFi systems is to construct an intermediate data-link layer
providing RelFi communication to the application. This typically involves sealing every single message 
transmission from interference by previous and later layers. Popular algorithms for data-link achieve this by 
adding message headers and/or acknowledging every single message, thereby incurring a significant
overhead [AAF+94, WZ89]. As we show for Rel, it is often possible to do better than that. Our analysis
of sealing can be used to add the minimal amount of glue between consecutive layers to ensure that they 
compose safely, without changing the layers at all. 

We have introduced a framework for studying safe program composition. It facilitates the formal
definition of standard notions such as CCL, barriers, and TCC. Gerth and Shrira showed that CCL, as a
context-sensitive notion, is unsuitable for compositional development of larger systems from off-the-shelf
components. As we have shown, neither barriers nor TCC layers are useful for such development in Rel,
that is, when communication is reliable but not FIFO. In another paper [EM05b], we use essentially the
same framework to investigate safe composition in models with message duplication or loss. Barriers and
TCC layers are also absent in those models. The framework introduced here is used to define two more
notions, namely fitting after and separating, that are more readily applicable in those models (see
footnote 9). We illustrate our approach by applying it to the case of Rel. Notably, the approach allows for
seamless composition of programs without need for translation or headers.



9 We say that P fits after Q if IQP <vlQlP- Program S separates P from Q if IP * S * Ql < r PlSlQ.

The central notion introduced and explored in this paper is that of one program sealing another. Larger
programs can be composed from smaller ones provided each smaller program seals its predecessor. For
instance, recall that $\mathrm{MT}^{i \to j} * \mathrm{MT}^{j \to i}$ seals itself in Rel. Lemma fTlBl can be used to show
that a program of the form while true do $\mathrm{MT}^{i \to j} * \mathrm{MT}^{j \to i}$ od can serve to transmit a sequence
of values from $i$ to $j$ in Rel. Indeed, if the return messages from $j$ to $i$ are not merely acknowledgments,
it can perform sequence exchange. The notion of sealing in Rel is shown to be intimately related to Lamport
causality. Based on this connection, we devise efficient algorithms for deciding and constructing seals for
the class of straight-line programs.

A. Algorithms

Program-Graph($P$)

1  $V, E \leftarrow \bigcup_{i \in \mathbb{P}} \{\mathrm{FST}_i, \mathrm{LST}_i\}, \emptyset$    ▷ First and last dummy nodes for each process
2  $f \leftarrow \lambda i : \mathbb{P}.\ \mathrm{FST}_i$    ▷ Book keeping for local precedence
3  ▷ Add sends and receives with local precedence
   for $e$ in $P$ from left to right where $e$ is of the form $\mathrm{SND}^{i \to j}$ or $\mathrm{RCV}^{i \leftarrow j}$
       do $V, E, f(i) \leftarrow V \cup \{e\},\ E \cup \{(f(i), e)\},\ e$
4  ▷ Add precedence between last $i$-event and $i$'s last dummy node
   for $i \in \mathbb{P}$
       do $E \leftarrow E \cup \{(f(i), \mathrm{LST}_i)\}$
5  ▷ Add precedence between FIFO matching sends and receives
   for $e \in V$ the $k$'th event in $P$ of the form $\mathrm{SND}^{i \to j}$ for some $i, j, k$
       do $E \leftarrow E \cup \{(e, e')\}$ where $e'$ is the $k$'th $\mathrm{RCV}^{j \leftarrow i}$ event in $P$
6  return $(V,E)$



Deadlock-Free($P$)

1  $V, E \leftarrow$ Program-Graph($P$)
2  return $\nexists$ cycle in $E$
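The acyclicity test behind Deadlock-Free can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The function name `deadlock_free`, the string labels for events, and the two example edge sets are assumptions of this sketch; dummy nodes are omitted because they cannot lie on a cycle.

```python
from graphlib import CycleError, TopologicalSorter

def deadlock_free(edges):
    """Return True iff the precedence edges contain no cycle."""
    sorter = TopologicalSorter()
    for before, after in edges:
        sorter.add(after, before)      # `before` must precede `after`
    try:
        sorter.prepare()               # raises CycleError on a cyclic graph
    except CycleError:
        return False
    return True

# Both processes send first and then receive: no cycle.
SEND_FIRST = {("SND1>2", "RCV1<2"), ("SND2>1", "RCV2<1"),   # local order
              ("SND1>2", "RCV2<1"), ("SND2>1", "RCV1<2")}   # send/receive pairs
# Both processes wait for a message before sending: each receive needs the
# other process's send, which only happens after that process's own receive.
RECEIVE_FIRST = {("RCV1<2", "SND1>2"), ("RCV2<1", "SND2>1"),
                 ("SND1>2", "RCV2<1"), ("SND2>1", "RCV1<2")}

print(deadlock_free(SEND_FIRST))
print(deadlock_free(RECEIVE_FIRST))
```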

Sig($P$)

1  $V, E \leftarrow$ Program-Graph($P$)
2  $E \leftarrow E^+$    ▷ Add irreflexive transitive closure
3  ▷ Remove all but minimal sends and maximal receives on open channels
   $V \leftarrow V \setminus \{\, e \mid e$ is a $\mathrm{SND}^{i \to j}$ event preceded by another such send or $\mathrm{FST}_j \,\}$
   $V \leftarrow V \setminus \{\, e \mid e$ is a $\mathrm{RCV}^{j \leftarrow i}$ event that precedes another such receive or $\mathrm{LST}_i \,\}$
4  return $(V, E \cap V^2)$

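Is-Seal($P, Q$)

1  $V_P, E_P \leftarrow \mathrm{Sig}(P)$
2  $V_Q, E_Q \leftarrow \mathrm{Sig}(Q)$
3  for $(i,j) \in \mathbb{P}^2 \setminus \mathrm{id}_{\mathbb{P}}$ s.t. $\mathrm{RCV}^{j \leftarrow i} \in V_P$
4      do $e \leftarrow \mathrm{SND}^{i \to j}$ if $\mathrm{SND}^{i \to j} \in V_Q$, and $e \leftarrow \mathrm{LST}_i$ otherwise
5         safe $\leftarrow$ false
6         for $k \in \mathbb{P} \setminus \{i\}$
7             do safe $\leftarrow$ safe $\lor\ ((\mathrm{RCV}^{j \leftarrow i}, \mathrm{LST}_k) \in E_P \land (\mathrm{FST}_k, e) \in E_Q)$
8         if $\lnot$safe
9             then return false
10 return true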

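Closed-Channels($P$)

1  $V, E \leftarrow \mathbb{P},\ \mathbb{P}^2 \setminus \mathrm{id}_{\mathbb{P}}$
2  $V', E' \leftarrow \mathrm{Sig}(P)$
3  for $(i,j) \in E$
4      do if $\mathrm{RCV}^{j \leftarrow i} \in V'$ and $(\mathrm{RCV}^{j \leftarrow i}, \mathrm{LST}_i) \notin E'$
5         then $E \leftarrow E \setminus \{(i,j)\}$
6  return $(V,E)$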

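Signature-Compose($V_P, E_P, V_Q, E_Q$)

1  ▷ Sequentially compose the two signatures
   $V \leftarrow \{\, e^{(X)} \mid e \in V_X \land X \in \{P, Q\} \,\}$
   $E \leftarrow \{\, (e^{(X)}, f^{(Y)}) \in V^2 \mid X = Y \land (e,f) \in E_X \,\} \cup \{\, (\mathrm{LST}_i^{(P)}, \mathrm{FST}_i^{(Q)}) \mid i \in \mathbb{P} \,\}$
2  $E \leftarrow E^+$
3  ▷ Remove dummy nodes between the two signatures
   $V \leftarrow V \setminus \{\, \mathrm{LST}_i^{(P)} \mid i \in \mathbb{P} \,\} \setminus \{\, \mathrm{FST}_i^{(Q)} \mid i \in \mathbb{P} \,\}$
4  ▷ Remove all but the first sends and last receives
   $V \leftarrow V \setminus \{\, e^{(Q)} \in V \mid e^{(P)} \in V \land e = \mathrm{SND}^{i \to j} \,\} \setminus \{\, e^{(P)} \in V \mid e^{(Q)} \in V \land e = \mathrm{RCV}^{j \leftarrow i} \,\}$
5  ▷ Remove sends and receives on closed channels
   $V \leftarrow V \setminus \{\, e^{(X)} \in V \mid (\mathrm{FST}_j^{(P)}, e^{(X)}) \in E \land e = \mathrm{SND}^{i \to j} \,\} \setminus \{\, e^{(X)} \in V \mid (e^{(X)}, \mathrm{LST}_i^{(Q)}) \in E \land e = \mathrm{RCV}^{j \leftarrow i} \,\}$
6  rename by dropping superscripts $(X)$
7  return $(V, E \cap V^2)$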